Fast DNA-PAINT imaging using a deep neural network

DNA points accumulation for imaging in nanoscale topography (DNA-PAINT) is a super-resolution technique with relatively easy-to-implement multi-target imaging. However, image acquisition is slow as sufficient statistical data has to be generated from spatio-temporally isolated single emitters. Here, we train the neural network (NN) DeepSTORM to predict fluorophore positions from high emitter density DNA-PAINT data. This achieves image acquisition in one minute. We demonstrate multi-colour super-resolution imaging of structure-conserved semi-thin neuronal tissue and imaging of large samples. This improvement can be integrated into any single-molecule imaging modality to enable fast single-molecule super-resolution microscopy.

here which uses the DeepSTORM predicted image as the reference image since it reports a higher decorrelation resolution value. By using this adapted approach, we were able to compare the images on a super-resolution scale (Supplementary Fig. 4).
Multi-Scale Structural Similarity Index (MS-SSIM) is a combined measure of the luminance, contrast, and structural similarity between two images, calculated at different image scales (ref. 3).
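To make the definition concrete, the following is a minimal sketch of a multi-scale similarity score. It is not the implementation used in this work: it assumes whole-image (global) statistics instead of a sliding window, 2x2 average pooling between scales, and the five-scale weights of the original MS-SSIM formulation. The function names are hypothetical.

```python
import numpy as np

# Per-scale exponents of the original five-scale MS-SSIM formulation (assumed here).
WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)

def _downsample(img):
    """Halve the image size by 2x2 average pooling (odd edges are cropped)."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def _ssim_terms(x, y, data_range):
    """Return the luminance term and the combined contrast-structure term."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    luminance = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    contrast_structure = (2 * cov + c2) / (x.var() + y.var() + c2)
    return luminance, contrast_structure

def ms_ssim_global(x, y, data_range=1.0, weights=WEIGHTS):
    """Simplified MS-SSIM: contrast-structure at every scale, luminance at the coarsest."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    score = 1.0
    for i, weight in enumerate(weights):
        luminance, cs = _ssim_terms(x, y, data_range)
        is_last = i == len(weights) - 1
        term = luminance * cs if is_last else cs
        score *= max(term, 0.0) ** weight  # clamp negative terms to zero
        if not is_last:
            x, y = _downsample(x), _downsample(y)
    return score
```

Images should be at least 16 pixels along each axis so that four rounds of pooling remain possible. A score near 1 indicates close agreement at all scales.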

Supplementary Note 2
The minimum number of frames required to obtain a completely reconstructed image using DeepSTORM was determined by comparing predicted images generated from 50 to 2,000 frames with the GT image and obtaining structural similarity values using HAWKMAN (Supplementary Fig. 1). We found that at an imager strand concentration of 5 or 10 nM, a completely reconstructed image could be obtained with only 400 frames (dotted lines) for both α-tubulin and TOM20. The performance of the single-molecule localisation algorithm Picasso was also tested using a 10 nM TOM20 high-density dataset (Supplementary Fig. 2), which showed incomplete reconstruction in TOM20 regions (white circles).
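One way to read a minimum frame number off such a similarity curve is sketched below. The plateau criterion (all later scores within a tolerance of the score at the longest acquisition) and the function name are assumptions made for illustration. The similarity scores in the example are made-up placeholders, not values measured in this study.

```python
def min_frames_for_plateau(frame_counts, scores, tolerance=0.03):
    """Smallest frame count from which every later score stays within
    `tolerance` of the score obtained at the largest frame count."""
    pairs = sorted(zip(frame_counts, scores))
    reference = pairs[-1][1]  # score at the longest acquisition
    for i, (n_frames, _) in enumerate(pairs):
        if all(score >= reference - tolerance for _, score in pairs[i:]):
            return n_frames
    return pairs[-1][0]  # defensive fallback; the loop always returns first

# Frame lengths as tested in the text; scores are illustrative placeholders only.
frames = [50, 100, 200, 400, 600, 1000, 2000]
placeholder_scores = [0.40, 0.55, 0.72, 0.91, 0.92, 0.92, 0.93]
print(min_frames_for_plateau(frames, placeholder_scores))
```

Requiring all later points to stay on the plateau avoids selecting a frame count at which a single noisy measurement happens to be high.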
The 1D (left) and 2D (right) structures of an α-tubulin-labelled tissue region from the same image are compared side by side (Supplementary Fig. 3a). The structure and sharpening maps both show that 1D α-tubulin filaments are predicted well by DeepSTORM (yellow structures; yellow arrows), but the strong presence of local cyan structures (white arrows) indicates that 2D α-tubulin bundles lack structural density in the predicted images (Supplementary Fig. 3b,c). The confidence maps corroborate these findings, showing low reconstruction correlation in the 2D regions and good correlation in the 1D regions (Supplementary Fig. 3d).
Image prediction similarity was assessed for α-tubulin- and TOM20-labelled images using HAWKMAN, SQUIRREL, MS-SSIM, and decorrelation resolution (Supplementary Fig. 4). The overlay between GT and predicted images (5, 10, or 20 nM) indicates either structural agreement (white), denser GT structures (cyan), or denser predicted structures (magenta) (Supplementary Fig. 4a). Visual comparisons indicate that the 5 nM and 10 nM predictions of α-tubulin are highly similar (Supplementary Fig. 4a i-ii), whereas a noticeable difference is observed at 20 nM, where the cyan in structurally dense regions is more prominent (Supplementary Fig. 4a iii; yellow arrows), suggesting incomplete reconstruction of the predicted image. Here, DeepSTORM loses its prediction quality at 20 nM for dense 2D α-tubulin structures while maintaining the reconstruction of 1D structures. A particularly dense 2D structural region of an axon (ref. 4) was chosen to examine the challenges faced by our NN model when applied to a high-density hotspot. GT (cyan) and predicted 5, 10, and 20 nM concentration (magenta) images were assessed (Supplementary Fig. 4b). We found that the prediction quality for extremely dense 2D structures was low, with pixelated rendering artefacts that were exacerbated by increasing imager strand concentrations (Supplementary Fig. 4b i-iv). SQUIRREL error maps indicate larger errors with increasing concentration of imager strands; dissimilarities in structures can be seen in blue and green regions, while yellow regions indicate differences in intensity (Supplementary Fig. 4b v-vii).
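The colour coding of such an overlay follows from additive blending of a cyan and a magenta channel assignment. The sketch below illustrates this under the assumption that both images are already registered on the same pixel grid. The min-max normalisation and the function names are choices made for this example, not a description of how the figures were rendered.

```python
import numpy as np

def normalise(img):
    """Scale an image to the range [0, 1]; constant images map to zeros."""
    img = np.asarray(img, dtype=float)
    span = img.max() - img.min()
    return (img - img.min()) / span if span > 0 else np.zeros_like(img)

def cyan_magenta_overlay(gt, pred):
    """Additively blend GT (cyan = green + blue) and prediction (magenta = red + blue).

    Pixels where both images are bright appear white, GT-only pixels appear
    cyan, and prediction-only pixels appear magenta.
    """
    g = normalise(gt)
    p = normalise(pred)
    red = p
    green = g
    blue = np.clip(g + p, 0.0, 1.0)
    return np.stack([red, green, blue], axis=-1)  # shape (height, width, 3)
```

The returned array can be displayed directly with any RGB image viewer that accepts floating-point values in [0, 1].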
Visual inspection of TOM20 predicted images compared to GT indicates that an imager strand concentration of 5 nM is not sufficient to completely reproduce mitochondrial structures (high cyan density), whereas at 10 nM there is better similarity between GT and the predicted image (Supplementary Fig. 4c i-ii). At 20 nM, hallucination artefacts that are not found in GT are predicted in the DeepSTORM image, seen as an increase in magenta structures (Supplementary Fig. 4c iii). A magnified region of a single mitochondrion shows that the GT image is finer and more punctate than the larger, more diffuse points of the predicted images (Supplementary Fig. 4d i-iv). While the mitochondrial structure and shape were effectively reconstructed at all three imager strand concentrations, the 5 nM prediction is incomplete (yellow arrows, Supplementary Fig. 4d ii). At 10 nM, the mitochondrial shape is more defined and better reproduced, and at 20 nM hallucination artefacts (features that do not exist in GT) are formed (magenta arrows, Supplementary Fig. 4d iv). The error maps show, albeit subtly, that the 10 nM imager strand concentration has the lowest structural error. Strong yellow regions dotted around the structure reflect differences in intensity rather than structural inconsistencies, possibly due to differences in emitter photon intensity or degree of sampling between the datasets during image acquisition (Supplementary Fig. 4d v-vii; white arrows; ref. 1). The quality of DeepSTORM predicted structures compared to GT was quantitatively assessed (Supplementary Fig. 4e). For SQUIRREL analysis, α-tubulin showed slightly higher RSP values at 5 nM imager strand concentration, while no difference was observed in TOM20 RSP values across imager strand concentrations. α-tubulin and TOM20 both have the lowest RSE at 5 nM (p = 0.02 for α-tubulin; ANOVA). This suggests that an increase in imager strand concentration contributes to higher background fluorescence, which affects image prediction quality. For α-tubulin, MS-SSIM indicated the highest structural similarity at 5 nM imager strand concentration, whereas for TOM20-labelled structures only 20 nM was unsuitable for prediction. In the HAWKMAN analysis for both structural reconstruction and artificial sharpening, α-tubulin at 5 nM performed well (p = 0.02; ANOVA), whereas TOM20 had comparable structural correlation at 5 and 10 nM and better sharpening correlation at 10 nM. Decorrelation resolution values for both α-tubulin and TOM20 are approximately 10 nm lower in all predicted images than in their respective GT images (~35 nm).
We applied SQUIRREL analysis to the super-resolved low-density emitter (0.5 nM, 10,000 frames, DNA-PAINT) and high-density predicted DeepSTORM images (5, 10, 20 nM; 400 frames) against their respective diffraction-limited DNA-PAINT frames obtained by z-projection (Supplementary Fig. 5). The low-density emitter image showed the best RSP value and the lowest RSE compared to DeepSTORM predicted images, which is also reflected in the error map showing high image correlation (Supplementary Fig. 5a iii). With increasing imager strand concentrations for predicted images, the RSP, RSE, and error map become worse. While the filamentous 1D structures on the left side of the images are largely unchanged in the error map, the prediction quality of dense 2D structures (right side) becomes noticeably poor (Supplementary Fig. 5a iv-vi). Similar to α-tubulin, the low-density emitter image for TOM20 showed the best outcome, with the highest RSP and lowest RSE value compared to predicted images (Supplementary Fig. 5b iii-vi). Again, both the RSP and RSE values suffered with increasing imager strand concentrations, although the difference between 5 and 10 nM was small in mitochondrial structures. Image prediction quality suffers with increasing imager strand concentrations, which may be attributed to excessive overlap of emitters and high background fluorescence. The prediction is also affected by structure dimensionality, whereby 1D structures were predicted better than 2D structures.
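The two SQUIRREL metrics used above can be illustrated with a short sketch. RSP is a Pearson correlation and RSE a root-mean-square error, both computed between the diffraction-limited reference and the super-resolution image after it has been blurred to the reference resolution and linearly rescaled in intensity. The sketch assumes a Gaussian blur of known width `sigma_px` and images that already share a pixel grid, whereas SQUIRREL itself estimates the resolution scaling function. All function names are hypothetical.

```python
import numpy as np

def gaussian_blur(img, sigma_px):
    """Separable Gaussian blur with zero padding (image must exceed the kernel size)."""
    radius = max(1, int(3 * sigma_px))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma_px ** 2))
    kernel /= kernel.sum()
    blur_1d = lambda v: np.convolve(v, kernel, mode="same")
    out = np.apply_along_axis(blur_1d, 0, np.asarray(img, dtype=float))
    return np.apply_along_axis(blur_1d, 1, out)

def rsp_rse(reference, super_res, sigma_px):
    """Return (RSP, RSE) of a super-resolution image against a reference image."""
    reference = np.asarray(reference, dtype=float)
    blurred = gaussian_blur(super_res, sigma_px)
    # Least-squares intensity rescaling: reference ~ a * blurred + b
    a, b = np.polyfit(blurred.ravel(), reference.ravel(), 1)
    scaled = a * blurred + b
    rse = np.sqrt(np.mean((reference - scaled) ** 2))          # error, lower is better
    rsp = np.corrcoef(reference.ravel(), scaled.ravel())[0, 1]  # correlation, higher is better
    return rsp, rse
```

The per-pixel difference `reference - scaled` corresponds to the kind of error map discussed in the text.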
We sought to determine the generality of our model by extending the range of protein target prediction using 2-target Exchange-PAINT. We found that our model could predict nanostructures on a scale of ~100 nm for Bassoon and Homer structures (Supplementary Fig. 7a,b) and differentiate between cells in MNTB tissue such as neurons and astrocytes (Supplementary Fig. 7c,d). Furthermore, the model was stable over many months since training, provided the optical setup was unchanged (Supplementary Fig. 8).
A post-processing extension of DeepSTORM functionality was developed in the ZeroCostDL4Mic platform, which extracts localisations from points in the predicted DeepSTORM image. The benefits of having SMLM localisations are the ability to perform drift correction, to render a super-resolution image with different algorithms, and to perform coordinate-based image analysis. We briefly studied the localisation output of the post-processing function in the Colab notebook. Experimental and artificial high-density TOM20 datasets were used for image prediction in DeepSTORM and subsequently post-processed to extract DeepSTORM localisations. The localisations were rendered in Picasso using the same rendering method as for the GT. The image similarity of the GT, the DeepSTORM predicted image, and the DeepSTORM localisations rendered in Picasso was compared using HAWKMAN and MS-SSIM (Supplementary Fig. 10). In general, there was a very slight increase of 0.02 in the similarity metrics of GT vs DeepSTORM localisations in the experimental dataset (Supplementary Fig. 10a). Although this difference is negligible, an improvement in the image analysis metrics can generally be attributed to the similar rendering method in Picasso using the One-Pixel Blur.
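The idea of turning a predicted image into a localisation list can be sketched as follows. This is not the ZeroCostDL4Mic post-processing code, whose procedure is not detailed here: it assumes simple thresholded local-maximum detection in an 8-connected neighbourhood, and the function name, `threshold`, and `pixel_size_nm` parameters are hypothetical.

```python
import numpy as np

def extract_localisations(pred, threshold, pixel_size_nm=1.0):
    """Return an array of (x_nm, y_nm, intensity) rows, one per local maximum.

    A pixel counts as a localisation if it exceeds `threshold` and is not
    smaller than any of its eight neighbours. Flat plateaus therefore yield
    several localisations; a real pipeline would need to merge them.
    """
    pred = np.asarray(pred, dtype=float)
    height, width = pred.shape
    padded = np.pad(pred, 1, mode="constant", constant_values=-np.inf)
    is_peak = pred > threshold
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[1 + dy:1 + dy + height, 1 + dx:1 + dx + width]
            is_peak &= pred >= neighbour
    rows, cols = np.nonzero(is_peak)
    return np.column_stack([cols * pixel_size_nm, rows * pixel_size_nm, pred[rows, cols]])
```

A coordinate table of this form is the kind of input that drift correction, alternative rendering, and coordinate-based analysis operate on.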
Supplementary Figure 1: Optimization of frame length for complete image reconstruction using DeepSTORM. High density frames at 5 (red), 10 (blue), and 20 (cyan) nM imager strand concentrations at frame lengths of 50, 100, 200, 400, 600, 1,000, and 2,000 were predicted with DeepSTORM and the predicted image similarity evaluated against GT images. HAWKMAN image similarity metric was applied to the whole image to determine the minimum number of frames required for complete image reconstruction. Vertical stippled line marks frame length at 400; n = 3 images per data point, error bars = SD. Source data are provided as a Source Data file.